This small dataset compares spectral measures generated by both PraatSauce v0.2.2 and VoiceSauce v1.31 at 1 msec intervals for 9 Madurese lexical items spoken by a single male speaker. (There are 12 items included, but here we set aside the centralised vowels for the time being.) The original audio files, included here, are from Misnadin and Kirby (forthcoming). For both scripts, 5 formants were estimated with a maximum formant frequency of 5000 Hz; minimum and maximum F0 values were set to 50 Hz and 300 Hz for all F0 estimators. For VoiceSauce, the STRAIGHT F0 estimate and Snack formant/bandwidth estimates were used for harmonic amplitude corrections.
The method column indicates whether the formant bandwidths were estimated using Praat (PraatSauce) or Snack (VoiceSauce), or whether the Hawks and Miller formula was used.
Madurese has a consonant-vowel co-occurrence restriction whereby vowels are organized into high/non-high pairs. The high member of each pair follows voiced and voiceless aspirated stops, while the non-high member follows voiceless unaspirated stops. The items analysed here are exemplars of each vowel quality with a bilabial onset, chosen in order to explore the effects of the spectral harmonic correction implementations.
head(df)
## Filename Item Gloss seg_Start seg_End t_ms t method
## 1 baca-read baca read 99.119 200.695 99.119 0.00000000 formula
## 2 baca-read baca read 99.119 200.695 100.119 0.00990099 formula
## 3 baca-read baca read 99.119 200.695 101.119 0.01980198 formula
## 4 baca-read baca read 99.119 200.695 102.119 0.02970297 formula
## 5 baca-read baca read 99.119 200.695 103.119 0.03960396 formula
## 6 baca-read baca read 99.119 200.695 104.119 0.04950495 formula
## script measure value corrected
## 1 PraatSauce pF0 126.646 uncorrected
## 2 PraatSauce pF0 126.571 uncorrected
## 3 PraatSauce pF0 126.497 uncorrected
## 4 PraatSauce pF0 126.422 uncorrected
## 5 PraatSauce pF0 126.347 uncorrected
## 6 PraatSauce pF0 126.273 uncorrected
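The t column looks like frame time rescaled to the interval [0, 1] within each file. The following is a minimal sketch of that reading, not the code used to build the dataset: the toy data frame, the assumed count of 102 frames and the helper name normalise are all illustrative assumptions.

```r
# Toy frame times for a single item, mirroring the t_ms values printed above.
# ASSUMPTION: 102 frames at 1 ms steps; the real frame count is not stated.
df_toy <- data.frame(
  Filename = "baca-read",
  t_ms = seq(99.119, by = 1, length.out = 102)
)

# Hypothetical helper: min-max rescaling to [0, 1]
normalise <- function(x) (x - min(x)) / (max(x) - min(x))

# Apply the rescaling separately within each file
df_toy$t_norm <- ave(df_toy$t_ms, df_toy$Filename, FUN = normalise)

head(round(df_toy$t_norm, 8))
```

Under these assumptions the step between consecutive frames is 1/101, which matches the spacing of the t values shown in the output above.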
In the plots that follow, the PraatSauce measures are unsmoothed. To compare against smoothed estimates instead, uncomment the following two lines:
ps.fbw <- cbind(ps.fbw[1:6], apply(ps.fbw[7:43], 2, filter, filter=f21, sides=2))
ps.ebw <- cbind(ps.ebw[1:6], apply(ps.ebw[7:43], 2, filter, filter=f21, sides=2))
To smooth the Matlab way instead, use the lag kernel: select filter=f20 and set sides=1.
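The kernels f21 and f20 are not defined in this excerpt. As a rough sketch only, the example below assumes they are plain moving-average kernels of length 21 and 20, and uses a toy step signal to show how sides=2 (centred) and sides=1 (lag only) differ in stats::filter.

```r
# ASSUMPTION: f21 and f20 are uniform moving-average kernels.
# Their actual definitions are not shown in this document.
f21 <- rep(1 / 21, 21)
f20 <- rep(1 / 20, 20)

# Toy step signal: 30 zeros followed by 30 ones
x <- c(rep(0, 30), rep(1, 30))

# Centred smoothing: each output uses 10 samples either side of the current one
centred <- stats::filter(x, filter = f21, sides = 2)

# Lag-only smoothing: each output uses the current sample and the 19 before it
lagged <- stats::filter(x, filter = f20, sides = 1)

# Number of undefined (NA) values each variant leaves at the edges
c(centred = sum(is.na(centred)), lagged = sum(is.na(lagged)))
```

The centred kernel leaves NA values at both ends of the series, whereas the lag-only kernel leaves them only at the start and delays the smoothed step relative to the input. Calling stats::filter explicitly avoids clashes with other packages that export a function named filter, such as dplyr.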
It is not clear why VoiceSauce’s pF0 (Praat) estimate differs so dramatically from the PraatSauce estimate, considering they should be using the same estimator with the same settings.
Compare with the VoiceSauce Snack estimates:
PraatSauce estimated bandwidths are huge…
Here, both Praat-based estimators seem to be in sync.
PraatSauce estimates are not far off the Snack estimates (if they really are Snack estimates).
Note that the choice of bandwidth estimator is irrelevant here.
For reasons I have not been able to work out, PraatSauce estimates are consistently 20-25 dB higher than VoiceSauce estimates. This doesn’t matter for the spectral differences, since an offset that is the same for both harmonics cancels when one amplitude is subtracted from the other. It would still be good to work out what the source of the difference is; it is probably some kind of amplitude normalization/attenuation being done by VoiceSauce somewhere.
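To illustrate why a fixed offset is harmless for difference measures such as H1-H2, here is a small sketch with made-up amplitudes. None of the numbers come from the dataset, and the 22 dB offset is simply a value inside the 20-25 dB range mentioned above.

```r
# Hypothetical harmonic amplitudes in dB (NOT taken from the dataset)
h1_vs <- c(10.2, 9.8, 9.5)
h2_vs <- c(6.1, 5.9, 6.4)

# ASSUMPTION: the second script reports every amplitude 22 dB higher
offset_db <- 22
h1_ps <- h1_vs + offset_db
h2_ps <- h2_vs + offset_db

# The offset cancels in the difference, so H1-H2 agrees across scripts
diff_vs <- h1_vs - h2_vs
diff_ps <- h1_ps - h2_ps
all.equal(diff_ps, diff_vs)
```

This holds only if the offset is the same for both harmonics in a given frame; an offset that varied with frequency would not cancel.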
Note that the VoiceSauce estimates are sometimes negative, which seems…strange.
Here, choice of formant bandwidth estimator potentially matters.
In these plots, PraatSauce is using Praat and VoiceSauce is using Snack estimates.
For VoiceSauce, no real difference can be observed:
For PraatSauce, using the formula bandwidths makes only minor differences:
Again, for VoiceSauce the choice of bandwidth estimate doesn’t seem to matter:
For PraatSauce things are less clear:
Largest difference for H4.
Largest differences for A1 and A2.
More interesting is probably a comparison of the corrected differences:
Very similar.
Praat(Sauce) estimates are comparable if smoothed.
Only HNR05 and HNR15 are shown here, for clarity.
Again, the Praat estimates differ in amplitude, but maintain roughly the same trajectories. However, the PraatSauce implementation is much less sophisticated than that of VoiceSauce, and relies entirely on Praat’s To Harmonicity... function. There does not appear to be much difference in the bands as PraatSauce estimates them.